# Querying Rows Using Core and ORM


This chapter covers the most frequently used Select in SQLAlchemy.


# Constructing SQL Expressions with select()

The select() constructor allows you to create query statements in the same way as the insert() constructor.

>>> from sqlalchemy import select
>>> stmt = select(user_table).where(user_table.c.name == 'spongebob')
>>> print(stmt)
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1
"""

Similarly, you can put a query in the Connection.execute() method to execute a query statement, just like any SQL constructor at the same level (select, insert, update,create and etc.).

>>> with engine.connect() as conn:
...     for row in conn.execute(stmt):
...         print(row)
(1, 'spongebob', 'Spongebob Squarepants')

On the other hand, if you want to use the ORM to execute a select query statement, you should use Session.exeuct().

The result returns a Row object, just like in the example just now. This object contains the User object that we defined in the previous tutorial (opens new window).

>>> stmt = select(User).where(User.name == 'spongebob')
>>> with Session(engine) as session:
...     for row in session.execute(stmt):
...         # Print each row in an instance of the User object
...         print(row)
(User(id=1, name='spongebob', fullname='Spongebob Squarepants'),)

# Setting up the FROM clause and columns

The select() function can take a variety of objects as positional arguments, including Column and Table.

These argument values can be represented as the return value of the select() function, i.e., as an SQL query statement, and can also set the FROM clause.

>>> print(select(user_table))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
"""

To retrieve each column using the Core, you can access the Column object through the Table.c accessor.

>>> print(select(user_table.c.name, user_table.c.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""

# ORM entity and column lookups

When implementing SQL queries in SQLAlchemy, you can use ORM entities like the User object or attributes that map to columns, such as User.name, to represent tables or columns. The example below queries the User entity, but in fact, the result is the same as when using user_table.

The example below looks up the User entity, but the results are the same as when using user_table.

>>> print(select(User))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
"""

In the above example, the query can be executed in the same way using ORM's Session.execute().

However, there is a difference between querying the User entity and querying user_info. Whether you query user_info or the User entity, in both cases a Row object is returned.

But, when querying the User entity, the returned Row object includes a User instance.

Tips:

The user_table and User were created in the previous chapter (opens new window), where user_table is a Table object, and User is an entity that inherits from the Base object and includes a Table object."

>>> with Session(engine) as session:
...     row = session.execute(select(User)).first()
...     print(row)
(User(id=1, name='spongebob',fullname='Spongebob Squarepants'),)

Alternatively, you can query the desired columns using object attributes(class-bound attributes).

>>> print(select(User.name, User.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""

When querying object attributes using Session.execute(), the values of the object attributes sent as arguments (column values) are returned as follows.

>>> with Session(engine) as session:
...	row = session.execute(select(User.name, User.fullname)).first()
...	print(row)
('spongebob', 'Spongebob Squarepants')

These methods can also be mixed and used together, as shown in the following example

>>> session.execute(
...     select(User.name, Address).
...     where(User.id==Address.user_id).
...     order_by(Address.id)
... ).all()
[('spongebob', Address(id=1, email_address='spongebob@sqlalchemy.org')),
('sandy', Address(id=2, email_address='sandy@sqlalchemy.org')),
('sandy', Address(id=3, email_address='sandy@squirrelpower.org'))]

# Querying Labeled SQL Expressions

When you execute a query like SELECT name AS username FROM user_account, you can get the following results:

username
patrick
sandy
spongebob

Here, we've labeled the name column as username, which is why username appears as the column header. This functionality can be implemented in SQLAlchemy using the ColumnElement.label() function, as shown below:

>>> from sqlalchemy import func, cast
>>> stmt = (
...     select(
...         # Labeling is done like this.
...         ("Username: " + user_table.c.name).label("username"),
...     ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
...     for row in conn.execute(stmt):
...         # The labeled part can be accessed like this.
...         print(f"{row.username}")
Username: patrick
Username: sandy
Username: spongebob

# Querying String Columns

Usually, columns are queried using the Select object or the select() constructor, but sometimes you need to query a column along with an arbitrary string. This section covers how to query such string data.

The text() constructor was introduced in a previous chapter 3 (opens new window). Transactions and Database API Operations. It allows you to directly use a SELECT statement within it.

Let's consider a scenario where we want to execute a query like SELECT 'some_phrase', name FROM user_account. In this case, since some_phrase is a string, it must be enclosed in either single or double quotes. Consequently, the output will inevitably have single quotes around the string.

>>> from sqlalchemy import text
>>> stmt = (
...     select(
...         text("'some phrase'"), user_table.c.name
...     ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
...     print(conn.execute(stmt).all())
[('some phrase', 'patrick'), ('some phrase', 'sandy'), ('some phrase', 'spongebob')]

Therefore, instead of text(), it is common to use literal_column() to solve the problem of having single quotes attached to the output. text() and literal_column() are almost similar, but literal_column() explicitly signifies a column and can be labeled for use in subqueries and other SQL expressions."

>>> from sqlalchemy import literal_column
>>> stmt = (
...     select(
...         literal_column("'some phrase'").label("p"), user_table.c.name
...     ).order_by(user_table.c.name)
... )
>>> with engine.connect() as conn:
...     for row in conn.execute(stmt):
...         print(f"{row.p}, {row.name}")
some phrase, patrick
some phrase, sandy
some phrase, spongebob


# WHERE Clauses

Using SQLAlchemy, you can easily write queries to output data where conditions like name = 'thead' or user_id > 10 are met using Python operators.

>>> print(user_table.c.name == 'squidward')
user_account.name = :name_1

>>> print(address_table.c.user_id > 10)
address.user_id > :user_id_1

To create a WHERE clause, you can pass arguments to the Select.where() method.

>>> print(select(user_table).where(user_table.c.name == 'squidward'))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1
"""

When implementing a JOIN with a WHERE clause, it can be written as follows.

>>> print(
...         select(address_table.c.email_address).
...         where(user_table.c.name == 'squidward').
...         where(address_table.c.user_id == user_table.c.id)
... )
"""
SELECT address.email_address
FROM address, user_account
WHERE user_account.name = :name_1 AND address.user_id = user_account.id
"""

# The same expression can be used, but you can also pass parameters to the where() method.
>>> print(
        select(address_table.c.email_address).
...         where(
...             user_table.c.name == 'squidward',
...             address_table.c.user_id == user_table.c.id
...         )
... )
"""
SELECT address.email_address
FROM address, user_account
WHERE user_account.name = :name_1 AND address.user_id = user_account.id
"""

It's also possible to use conjunctions such as and_() and or_().

>>> from sqlalchemy import and_, or_
>>> print(
...     select(Address.email_address).
...         where(
...             and_(
...                 or_(User.name == 'squidward', User.name == 'sandy'),
...                 Address.user_id == User.id
...             )
...         )
...  )
"""
SELECT address.email_address
FROM address, user_account
WHERE (user_account.name = :name_1 OR user_account.name = :name_2)
AND address.user_id = user_account.id
"""

For simple equality or inequality comparisons, Select.filter_by() is often used.

>>> print(
...     select(User).filter_by(name='spongebob', fullname='Spongebob Squarepants')
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account
WHERE user_account.name = :name_1 AND user_account.fullname = :fullname_1
"""

# Specifying the FROM Clause and JOINs

As mentioned before, the FROM clause is automatically set based on the columns included as arguments in the select() method, without the need for explicit specification.

#     Even without specifying the FROM clause explicitly, it is set and displayed in the output.
>>> print(select(user_table.c.name))
"""
SELECT user_account.name
FROM user_account
"""

If you want to reference columns from two different tables in the positional arguments of select(), you can separate them with a comma (,).

>>> print(select(user_table.c.name, address_table.c.email_address))
"""
SELECT user_account.name, address.email_address
FROM user_account, address
"""

If you want to join two different tables, there are two methods you can use:

One is the Select.join() method, which allows you to explicitly specify the left and right tables for the JOIN.

>>> print(
...     select(user_table.c.name, address_table.c.email_address).
...     join_from(user_table, address_table)
... )
"""
SELECT user_account.name, address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""

The other is to explicitly specify only the right table in the Select.join() method, and let the other table be implicitly referenced when selecting columns.

# This expression is the same, but the left table to join (user_table) is expressed implicitly.
>>> print(
...     select(user_table.c.name, address_table.c.email_address).
...     join(address_table)
... )
"""
SELECT user_account.name, address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""

Alternatively, if you want to write the two JOINing tables more explicitly, or if you want to provide explicit additional options in the FROM clause, you can write it as follows.

>>> print(
...     select(address_table.c.email_address).
...     select_from(user_table).join(address_table)
... )
"""
SELECT address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""

Another case for using Select.select_from() is when we cannot implicitly set the FROM clause through the columns we want to query.

For example, to query count(*) in a typical SQL query, you would need to use sqlalchemy.sql.expression.func from SQLAlchemy.

>>> from sqlalchemy import func
>>> print(select(func.count('*')).select_from(user_table))
"""
SELECT count(:count_2) AS count_1
FROM user_account
"""

# Setting the ON Clause

But there was something unusual, wasn't there? In fact, in the previous example, when joining two tables using Select.select_from() or select.join(), the ON clause was implicitly set.

This automatic setting of the ON clause happened because the user_table and address_table objects have a ForeignKeyConstraint, i.e., a foreign key constraint, which led to the automatic setting.

If the two tables targeted for a Join lack such constraint keys, you must explicitly specify the ON clause. This functionality can be explicitly set by passing parameters to the Select.join() or Select.join_from() methods for the ON clause.

>>> print(
...     select(address_table.c.email_address).
...     select_from(user_table).
...     join(address_table, user_table.c.id == address_table.c.user_id)
... )
"""
SELECT address.email_address
FROM user_account JOIN address ON user_account.id = address.user_id
"""

# OUTER, FULL Join

To implement LEFT OUTER JOIN or FULL OUTER JOIN In SQLAlchemy, you can use the keyword arguments Select.join.isouter and Select.join.full in the Select.join() and Select.join_from() methods.

An examples of implementing the LEFT OUTER JOIN:

>>> print(
...     select(user_table).join(address_table, isouter=True)
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account LEFT OUTER JOIN address ON user_account.id = address.user_id
"""

An examples of implementing the FULL OUTER JOIN:

>>> print(
...     select(user_table).join(address_table, full=True)
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account FULL OUTER JOIN address ON user_account.id = address.user_id
"""

# ORDER BY, GROUP BY, HAVING

  • The ORDER BY clause allows you to set the order of the rows retrieved by the SELECT clause.
  • The GROUP BY clause creates groups based on a specific column for rows aggregated by group functions.
  • HAVING applies conditions to groups created by the GROUP BY clause.

# ORDER BY

You can implement the ORDER BY feature using Select.order_by(). This method accepts Column objects or similar objects as positional arguments.

>>> print(select(user_table).order_by(user_table.c.name))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account ORDER BY user_account.name
"""

Ascending and descending order can be implemented using the ColumnElement.asc() and ColumnElement.desc() modifiers, respectively.

The following example orders by the user_account.fullname column in descending order.

>>> print(select(User).order_by(User.fullname.desc()))
"""
SELECT user_account.id, user_account.name, user_account.fullname
FROM user_account ORDER BY user_account.fullname DESC
"""

# Aggregations: GROUP BY, HAVING

In SQL, aggregate functions can also be used to combine multiple rows into a single row. Examples of aggregate functions include COUNT(), SUM(), and AVG().

SQLAlchemy provides SQL functions using the func namespace, where func creates a Function instance when given the name of an SQL function.

In the example below, the count() function is called to render the user_account.id column with the SQL COUNT() function.

>>> from sqlalchemy import func
>>> count_fn = func.count(user_table.c.id)
>>> print(count_fn)
"""
count(user_account.id)
"""

More details about SQL functions are explained in Handling SQL Functions.

To summarize:

GROUP BY is a function needed to divide the retrieved rows into specific groups. In SQL, if a few columns are queried in the SELECT clause, these columns are directly or indirectly dependent on the primary key in the GROUP BY.

HAVING is necessary to apply conditions to groups created by GROUP BY (similar to the WHERE clause because it places conditions on groups).

In SQLAlchemy, GROUP BY and HAVING can be implemented using Select.group_by() and Select.having().

>>> with engine.connect() as conn:
...     result = conn.execute(
...         select(User.name, func.count(Address.id).label("count")).
...         join(Address).
...         group_by(User.name).
...         having(func.count(Address.id) > 1)
...     )
...     print(result.all())
""" The syntax above represents the SQL statement below.
SELECT user_account.name, count(address.id) AS count
FROM user_account JOIN address ON user_account.id = address.user_id GROUP BY user_account.name
HAVING count(address.id) > ?
[...] (1,)
"""

[('sandy', 2)]

# Grouping or ordering by alias

In some database backends, when using aggregate functions to query tables, it is important not to restate already specified aggregate functions in the ORDER BY or GROUP BY clauses.

# NOT GOOD
SELECT id, COUNT(id) FROM user_account GROUP BY id ORDER BY count(id)

# CORRECT
SELECT id, COUNT(id) as cnt_id FROM user_account GROUP BY id ORDER BY cnt_id

Therefore, to implement ORDER BY or GROUP BY using aliases, you just need to insert the alias you want to use as an argument in the Select.order_by() or Select.group_by() methods.

The alias used here is not rendered first; instead, the alias used in the column clause is rendered first. If the rendered alias does not match anything in the rest of the query, an error occurs.

>>> from sqlalchemy import func, desc
>>> # The alias 'num_addresses' is used in both the column and in the order_by clause.
>>> stmt = select(
...         Address.user_id,
...         func.count(Address.id).label('num_addresses')).\
...         group_by("user_id").order_by("user_id", desc("num_addresses"))
>>> print(stmt)
"""
SELECT address.user_id, count(address.id) AS num_addresses
FROM address GROUP BY address.user_id ORDER BY address.user_id, num_addresses DESC
"""

# Using aliases

When using JOIN to query multiple tables, it's often necessary to repeatedly write the table names in the query.

In SQL, this issue can be addressed by giving aliases to table names or subqueries, reducing repetition.

In SQLAlchemy, such aliases can be implemented using the Core's FromClause.alias() function.

Within the Table object namespace, there are Column objects, allowing access to column names via Table.c.

print(select(user_table.c.name, user_table.c.fullname))
"""
SELECT user_account.name, user_account.fullname
FROM user_account
"""

Similarly, in the Alias object namespace, there are Column objects, making it possible to access columns via Alias.c.

>>> # Both user_alias_1 and user_alias_2 are Alias objects.
>>> user_alias_1 = user_table.alias(‘table1’) 
>>> user_alias_2 = user_table.alias(‘table2’)
>>> # To access columns using the newly created table aliases,
>>> # you should use Alias.c.column_name
>>> print(
...     select(user_alias_1.c.name, user_alias_2.c.name).
...     join_from(user_alias_1, user_alias_2, user_alias_1.c.id > user_alias_2.c.id)
... )

"""
SELECT table1.name, table2.name AS name_1 
FROM user_account AS table1 JOIN user_account AS table2 ON table1.id > table2.id
"""

# ORM Entity Aliases

The ORM in SQLAlchemy also has a function similar to the FromClause.alias() method, known as aliased()

This ORM aliased() function internally creates an Alias object for the originally mapped Table object, while maintaining ORM functionalities.

Tips:

The user_table and User were created in the previous chapter (opens new window), where user_table is a Table object, and User is an entity that inherits from the Base object and includes a Table object."

>>> user_alias_1 = user_table.alias()
>>> user_alias_2 = user_table.alias()
>>> # In the examples, it is applied to the User or Address entities.
>>> print(
...     select(User).
...     join_from(User, address_alias_1).
...     where(address_alias_1.email_address == 'patrick@aol.com').
...     join_from(User, address_alias_2).
...     where(address_alias_2.email_address == 'patrick@gmail.com')
... )
"""
SELECT user_account.id, user_account.name, user_account.fullname 
FROM user_account JOIN address AS address_1 ON user_account.id = address_1.user_id JOIN address AS address_2 ON user
"""

# Subqueries and CTE(Common Table Expression)s

This section explains subqueries typically found in the FROM clause of a SELECT statement. It also covers CTEs (Common Table Expressions), which are used in a similar way to subqueries but with additional functionalities.

More about CTE

A CTE is a temporary result set within a query that can be referenced multiple times within the same query. You can check the official links to describe CTEs in RDBMS below.

SQLAlchemy represents subqueries using the Subquery object created by Select.subquery(), and CTEs are represented using Select.cte().

>>> subq = select(
...     func.count(address_table.c.id).label("count"),
...     address_table.c.user_id
... ).group_by(address_table.c.user_id).subquery()
>>> print(subq)
"""
SELECT count(address.id) AS count, address.user_id
FROM address GROUP BY address.user_id
"""

>>> # The ON clause automatically binds two tables
>>> # that are already constrained by a foreign key.
>>> stmt = select(
...    user_table.c.name,
...    user_table.c.fullname,
...    subq.c.count
... ).join_from(user_table, subq)
>>> print(stmt)
"""
SELECT user_account.name, user_account.fullname, anon_1.count
FROM user_account JOIN (SELECT count(address.id) AS count, address.user_id AS user_id
FROM address GROUP BY address.user_id) AS anon_1 ON user_account.id = anon_1.user_id
"""

# Hierarchy Query

The method of using CTE syntax in SQLAlchemy is almost identical to how subquery syntax is used. Instead of calling the Select.subquery() method, you use Select.cte(), allowing the resulting object to be used as a FROM element.

>>> subq = select(
...     func.count(address_table.c.id).label("count"),
...     address_table.c.user_id
... ).group_by(address_table.c.user_id).cte()

>>> stmt = select(
...    user_table.c.name,
...    user_table.c.fullname,
...    subq.c.count
... ).join_from(user_table, subq)

>>> print(stmt)
"""
WITH anon_1 AS
(SELECT count(address.id) AS count, address.user_id AS user_id
FROM address GROUP BY address.user_id)
 SELECT user_account.name, user_account.fullname, anon_1.count
FROM user_account JOIN anon_1 ON user_account.id = anon_1.user_id
"""

# ORM Entity Subqueries, CTE

You can see that aliased() performs the same operation for Subquery and CTE subqueries.

>>> subq = select(Address).where(~Address.email_address.like('%@aol.com')).subquery()
>>> address_subq = aliased(Address, subq)
>>> stmt = select(User, address_subq).join_from(User, address_subq).order_by(User.id, address_subq.id)
>>> with Session(engine) as session:
...     for user, address in session.execute(stmt):
...         print(f"{user} {address}")

""" The above syntax represents the following query:

SELECT user_account.id, user_account.name, user_account.fullname,
anon_1.id AS id_1, anon_1.email_address, anon_1.user_id
FROM user_account JOIN
(SELECT address.id AS id, address.email_address AS email_address, address.user_id AS user_id
FROM address
WHERE address.email_address NOT LIKE ?) AS anon_1 ON user_account.id = anon_1.user_id
ORDER BY user_account.id, anon_1.id
[...] ('%@aol.com',)
"""

User(id=1, name='spongebob', fullname='Spongebob Squarepants') Address(id=1, email_address='spongebob@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=2, email_address='sandy@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=3, email_address='sandy@squirrelpower.org')

Below is an example of querying the same result using the CTE constructor:

>>> cte = select(Address).where(~Address.email_address.like('%@aol.com')).cte()
>>> address_cte = aliased(Address, cte)
>>> stmt = select(User, address_cte).join_from(User, address_cte).order_by(User.id, address_cte.id)
>>> with Session(engine) as session:
...     for user, address in session.execute(stmt):
...         print(f"{user} {address}")

User(id=1, name='spongebob', fullname='Spongebob Squarepants') Address(id=1, email_address='spongebob@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=2, email_address='sandy@sqlalchemy.org')
User(id=2, name='sandy', fullname='Sandy Cheeks') Address(id=3, email_address='sandy@squirrelpower.org')

# Scalar Subqueries and Correlated Queries

Before explaining scalar subqueries, let's briefly discuss subqueries in SQL. 출처:바이헨 블로그 (opens new window)

A "subquery" is a SELECT statement within another SQL statement, and the outer SQL statement is referred to as the "main query".

The types of subqueries are determined based on whether they reference columns of the main query, where they are declared, and the number of rows they return.

  • Classification based on reference to main query columns:
    • Correlated Subqueries: The subquery references columns of the main query.
    • Non-correlated Subqueries: The subquery does not reference the main query's columns and operates independently, used to convey information to the main query.
  • Classification based on declaration position:
    • Scalar Subqueries: Subqueries that appear in the column position of a SELECT statement (correlated).
    • Inline Views: Subqueries in the FROM clause (correlated).
    • Nested Subqueries: Subqueries in the WHERE clause (non-correlated).
  • Classification based on the number of rows returned:
    • Single-row Subqueries (return one row)
    • Multi-row Subqueries (return more than one row): Used with IN, ANY, ALL, EXISTS.

In SQLAlchemy, scalar subqueries use ScalarSelect, which is part of the ColumnElement object, while general subqueries use Subquery, which is in the FromClause object.

Scalar subqueries are often used as described earlier in Aggregations.

# Implementing Scalar subquery using Select.scalar_subquery()
>>> subq = select(func.count(address_table.c.id)).
...             where(user_table.c.id == address_table.c.user_id).
...             scalar_subquery()
>>> print(subq) # ... is equal to "ScalarSelect" type
"""
(SELECT count(address.id) AS count_1
FROM address, user_account
WHERE user_account.id = address.user_id)
"""

Scalar subqueries implemented using Select.scalar_subquery() render the user_account and address in the FROM clause, but since the user_account table is already present in the main query, it is not rendered again in the scalar subquery.

>>> stmt = select(user_table.c.name, subq.label("address_count"))
>>> print(stmt)
"""
SELECT user_account.name, (SELECT count(address.id) AS count_1
FROM address
WHERE user_account.id = address.user_id) AS address_count
FROM user_account
"""

Meanwhile, when writing correlated queries, the connections between tables can become ambiguous.

I did not understand the correlated query example in the tutorial. If someone understands it well, please contribute to this document.


# UNION, UNION ALL operators

In SQL, terms like UNION and UNION ALL are used to combine two SELECT statements. Queries can be executed as shown below.

SELECT id FROM user_account
union 
SELECT email_address FROM address

Additionally, SQL supports set operations like INTERSECT (intersection) and EXCEPT (difference). In SQLAlchemy, for Select objects, functions such as union(), intersect(), except_(), union_all(), intersect_all(), and except_all() are available.

The return value of these functions is a CompoundSelect, which is an object that can be used similarly to Select but has fewer methods. The CompoundSelect object returned by union_all() can be executed with Connection.execute().

>>> from sqlalchemy import union_all
>>> stmt1 = select(user_table).where(user_table.c.name == 'sandy')
>>> stmt2 = select(user_table).where(user_table.c.name == 'spongebob')
>>> u = union_all(stmt1, stmt2) # A value u is a CompoundSelect type.
>>> with engine.connect() as conn:
...     result = conn.execute(u)
...     print(result.all())

[(2, 'sandy', 'Sandy Cheeks'), (1, 'spongebob', 'Spongebob Squarepants')]

Just as Select provides the SelectBase.subquery() method to create Subquery objects, CompoundSelect objects can similarly be used as subqueries.

>>> u_subq = u.subquery()
>>> stmt = (
...     select(u_subq.c.name, address_table.c.email_address).
...     join_from(address_table, u_subq).
...     order_by(u_subq.c.name, address_table.c.email_address)
... )
>>> with engine.connect() as conn:
...     result = conn.execute(stmt)
...     print(result.all())

[('sandy', 'sandy@sqlalchemy.org'), ('sandy','sandy@squirrelpower.org'), 
('spongebob', 'spongebob@sqlalchemy.org')]

# EXISTS Subqueries

SQLAlchemy creates an Exists object through the SelectBase.exists() method to implement the EXISTS clause.

>>> # subq is a Exists type
>>> subq = (
...     select(func.count(address_table.c.id)).
...     where(user_table.c.id == address_table.c.user_id).
...     group_by(address_table.c.user_id).
...     having(func.count(address_table.c.id) > 1)
... ).exists()
>>> print(subq)
"""
EXISTS (SELECT count(address.id) AS count_1 
FROM address, user_account 
WHERE user_account.id = address.user_id GROUP BY address.user_id 
HAVING count(address.id) > :count_2)
"""
>>> with engine.connect() as conn:
...     result = conn.execute(
...         select(user_table.c.name).where(subq)
...     )
...     print(result.all())

[('sandy',)]

The EXISTS clause is more often used in a non-negated form by the way.

# This is a query to select usernames that do not have an email address.
# Take a look at the part where the '~' operator is used."
>>> subq = (
...     select(address_table.c.id).
...     where(user_table.c.id == address_table.c.user_id)
... ).exists()
>>> stmt = select(user_table.c.name).where(~subq)
>>> print(stmt)
"""
SELECT user_account.id 
FROM user_account 
WHERE NOT (EXISTS (SELECT count(address.id) AS count_1 
FROM address 
WHERE user_account.id = address.user_id GROUP BY address.user_id 
HAVING count(address.id) > :count_2))
"""
>>> with engine.connect() as conn:
...     result = conn.execute(stmt)
...     print(result.all())

[('patrick',)]

# Dealing with SQL functions.

In the earlier section Aggregations: GROUP BY, HAVING, the func object, which acts as a factory for creating new Function objects, was introduced. When using syntax like select(), you can pass SQL functions created by the func object as arguments.

  • count() : Aggregate functions are used to print the number of rows.
    >>> # cnt is a type of  <class 'sqlalchemy.sql.functions.count'>.
    >>> cnt = func.count() 
    >>> print(select(cnt).select_from(user_table))
    """
    SELECT count(*) AS count_1FROM user_account
    """
    
  • lower() : String functions are used to convert strings to lowercase.
    >>> print(select(func.lower("A String With Much UPPERCASE")))
    """
    SELECT lower(:lower_2) AS lower_1
    """
    
  • now() : There is also a function that returns the current time and date. This function is commonly used, so SQLAlchemy helps in easily rendering it across different backends.
    >>> stmt = select(func.now())
    >>> with engine.connect() as conn:
    ...     result = conn.execute(stmt)
    ...     print(result.all())
    
    [(datetime.datetime(...),)]
    

Different database backends have SQL functions with different names. Therefore, func allows access to any name in its namespace, automatically interpreting that name as an SQL function and rendering it.

>>> # A data type of crazy_function is Function.
>>> crazy_function = func.some_crazy_function(user_table.c.fullname, 17)
>>> print(select(crazy_function))
"""
SELECT some_crazy_function(user_account.name, :some_crazy_function_2) AS some_crazy_function_1
FROM user_account
"""

Meanwhile, SQLAlchemy provides appropriate data types for commonly used SQL functions like count, now, max, concat, etc., specific to each backend.

>>> from sqlalchemy.dialects import postgresql
>>> print(select(func.now()).compile(dialect=postgresql.dialect()))
"""
SELECT now() AS now_1
"""

>>> from sqlalchemy.dialects import oracle
>>> print(select(func.now()).compile(dialect=oracle.dialect()))
"""
SELECT CURRENT_TIMESTAMP AS now_1 FROM DUAL
"""

# Functions Have Return Types

I did not understand the part about 'Functions Have Return Types' in the original text. If anyone understands this, please contribute to this section. Thank you.


# Built-in Functions Have Pre-Configured Return Types

I did not understand the part about 'Built-in Functions Have Pre-Configured Return Types' in the original text. If anyone understands this, please contribute to this section. Thank you.


# WINDOW Functions

Window functions are similar to GROUP BY, created to easily define relationships between rows.

In SQLAlchemy, among all SQL functions created by the func namespace, there is the FunctionElement.over() method, which implements the OVER clause.

One of the window functions is row_number(), which counts the number of rows. You can group each row by username and then number the email addresses within each group.

# The FunctionElement.over.partition_by parameter is used
# to render the PARTITION BY clause in the OVER clause.
>>> stmt = select(
...     func.row_number().over(partition_by=user_table.c.name),
...     user_table.c.name,
...     address_table.c.email_address
... ).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:  
...     result = conn.execute(stmt)
...     print(result.all())
[(1, 'sandy', 'sandy@sqlalchemy.org'), 
 (2, 'sandy', 'sandy@squirrelpower.org'), 
 (1, 'spongebob', 'spongebob@sqlalchemy.org')]

FunctionElement.over.order_by can be used to apply an ORDER BY clause.

>>> stmt = select(
...     func.count().over(order_by=user_table.c.name),
...     user_table.c.name,
...     address_table.c.email_address).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:  
...     result = conn.execute(stmt)
...     print(result.all())

[(2, 'sandy', 'sandy@sqlalchemy.org'), 
 (2, 'sandy', 'sandy@squirrelpower.org'), 
 (3, 'spongebob', 'spongebob@sqlalchemy.org')]

# Special Modifiers like WITHIN GROUP, FILTER

The SQL clause WITHIN GROUP is used with ordered sets or hypothetical sets along with aggregate functions.

Common ordered set functions include percentile_cont() and rank().

In SQLAlchemy, functions such as rank, dense_rank, percentile_count, and percentile_disc are implemented, each with the FunctionElement.within_group() method.

>>> print(
...     func.unnest(
...         func.percentile_disc([0.25,0.5,0.75,1]).within_group(user_table.c.name)
...     )
... )
"""
unnest(percentile_disc(:percentile_disc_1) WITHIN GROUP (ORDER BY user_account.name))
"""

Some backends support the "FILTER" modifier, which can be utilized through the FunctionElement.filter() method in SQLAlchemy.

>>> stmt = select(
...     func.count(address_table.c.email_address).filter(user_table.c.name == 'sandy'),
...     func.count(address_table.c.email_address).filter(user_table.c.name == 'spongebob')
... ).select_from(user_table).join(address_table)
>>> with engine.connect() as conn:  
...     result = conn.execute(stmt)
...     print(result.all())

"""
SELECT count(address.email_address) FILTER (WHERE user_account.name = ?) AS anon_1,
count(address.email_address) FILTER (WHERE user_account.name = ?) AS anon_2
FROM user_account JOIN address ON user_account.id = address.user_id
"""

('sandy', 'spongebob')
[(2, 1)]

# Table-Valued Functions

I did not understand the part about 'Table-Valued Functions' in the original text. If anyone understands this, please contribute to this section. Thank you.


# Column Value Functions or Scalar Column (Table Valued Functions)

One of the special syntaxes supported by Oracle and PostgreSQL is functions set in the FROM clause. Examples in PostgreSQL include json_array_elements(), json_object_keys(), json_each_text(), and json_each().

SQLAlchemy refers to these functions as column values and applies them using the FunctionElement.column_valued() specifier on a Function object.

>>> from sqlalchemy import select, func
>>> stmt = select(func.json_array_elements('["one", "two"]').column_valued("x"))
>>> print(stmt)
"""
SELECT x
FROM json_array_elements(:json_array_elements_1) AS x
"""

Column value functions can also be used in Oracle as custom SQL functions, as shown below.

>>> from sqlalchemy.dialects import oracle
>>> stmt = select(func.scalar_strings(5).column_valued("s"))
>>> print(stmt.compile(dialect=oracle.dialect()))
"""
SELECT COLUMN_VALUE s
FROM TABLE (scalar_strings(:scalar_strings_1)) s
"""